6 research outputs found

    Enabling Program Analysis Through Deterministic Replay and Optimistic Hybrid Analysis

    As software continues to evolve, software systems increase in complexity. With software systems composed of many distinct but interacting components, today's system programmers, users, and administrators find themselves requiring automated ways to find, understand, and handle system misbehavior. Recent information breaches, such as the Equifax breach of 2017 and the Heartbleed vulnerability of 2014, show the need to understand and debug prior states of computer systems. In this thesis I focus on enabling practical entire-system retroactive analysis, allowing programmers, users, and system administrators to diagnose and understand the impact of these devastating mishaps. I focus primarily on two techniques. First, I discuss a novel deterministic record and replay system which enables fast, practical recollection of entire-system state. Second, I discuss optimistic hybrid analysis, a novel optimization method capable of dramatically accelerating retroactive program analysis.

    Record and replay systems greatly aid in solving a variety of problems, such as fault tolerance, forensic analysis, and information provenance. These solutions, however, assume ubiquitous recording of any application which may have a problem. Current record and replay systems are forced to trade off between disk space and replay speed, a trade-off that has historically made it impractical to both record and replay large histories of system-level computation. I present Arnold, a novel record and replay system which efficiently records years of computation on a commodity hard drive and can efficiently replay any recorded information. Arnold combines caching with a unique process-group granularity of recording to produce recordings that are both small and quickly recalled. My experiments show that under a desktop workload, Arnold could store 4 years of computation on a commodity 4TB hard drive.

    Dynamic analysis is used to retroactively identify and address many forms of system misbehavior, including programming errors, data races, private information leakage, and memory errors. Unfortunately, the runtime overhead of dynamic analysis has precluded its adoption in many instances. I present a new dynamic analysis methodology called optimistic hybrid analysis (OHA). OHA uses knowledge of the past to predict program behaviors in the future. These predictions, or likely invariants, are speculatively assumed true by a static analysis, yielding a static analysis that can be far more accurate than its traditional counterpart. This predicated static analysis is then speculatively used to optimize a final dynamic analysis, creating a far more efficient dynamic analysis than otherwise possible. I demonstrate the effectiveness of OHA by creating an optimistic hybrid backward slicer, OptSlice, and an optimistic data-race detector, OptFT. OptSlice and OptFT are just as accurate as their traditional hybrid counterparts, but run on average 8.3× and 1.6× faster, respectively.

    In this thesis I demonstrate that Arnold's ability to record and replay entire computer systems, combined with optimistic hybrid analysis's ability to quickly analyze prior computation, enables a practical and useful entire-system retroactive analysis that has previously been unrealized.

    PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144052/1/ddevec_1.pd
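    To make the OHA pipeline concrete, here is a minimal Python sketch of the predict-prune-speculate loop described above; the site names, invariant format, and fallback policy are illustrative assumptions, not the thesis implementation.

        # Hypothetical sketch of optimistic hybrid analysis (OHA): mine likely
        # invariants from past runs, let a static analysis speculatively assume
        # them to prune instrumentation, and fall back on misspeculation.

        ALL_SITES = {"buf_copy", "parse_hdr", "log_write"}  # invented program sites

        def mine_likely_invariants(profiles):
            # A "likely invariant" is any (site, property) pair that held in
            # every recorded execution.
            common = set(profiles[0])
            for run in profiles[1:]:
                common &= set(run)
            return common

        def predicated_static_analysis(sites, invariants):
            # Assuming the invariants, the static analysis can prove more
            # sites safe, so fewer sites need dynamic instrumentation.
            assumed_safe = {site for (site, _prop) in invariants}
            return sites - assumed_safe

        def optimized_dynamic_analysis(trace, invariants):
            monitored = predicated_static_analysis(ALL_SITES, invariants)
            for site, prop, held in trace:
                if (site, prop) in invariants and not held:
                    # Misspeculation: an assumed invariant was violated, so
                    # re-run the unoptimized (fully instrumented) analysis.
                    return optimized_dynamic_analysis(trace, frozenset())
                if site in monitored:
                    pass  # the expensive per-event check runs only here
            return monitored

        past_runs = [
            {("buf_copy", "index_in_bounds"), ("parse_hdr", "ptr_nonnull")},
            {("buf_copy", "index_in_bounds")},
        ]
        likely = mine_likely_invariants(past_runs)
        trace = [("buf_copy", "index_in_bounds", True),
                 ("log_write", "ptr_nonnull", True)]
        print("sites still instrumented:", optimized_dynamic_analysis(trace, likely))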

    Zero Knowledge for Everything and Everyone: Fast ZK Processor with Cached RAM for ANSI C Programs

    We build a complete and efficient ZK toolchain that handles proof statements encoded as arbitrary ANSI C programs. Zero-Knowledge (ZK) proofs are foundational in cryptography. Recent ZK research has focused intensely on non-interactive proofs of small statements, useful in blockchain scenarios. We instead target large statements that are useful, e.g., in proving properties of programs. Recent work (Heath and Kolesnikov, CCS 2020 [HK20a]) designed a proof-of-concept ZK machine (ZKM). Their machine executes arbitrary programs over a minimal instruction set, authenticating in ZK the program execution. In this work, we significantly extend this research thrust, both in terms of efficiency and generality. Our contributions include:

    • A rich and performance-oriented architecture for representing arbitrary ZK proofs as programs.
    • A complete compiler toolchain providing full support for ANSI C95 programs. We ran off-the-shelf buggy versions of sed and gzip, proving in ZK that each program has a bug. To our knowledge, this is the first ZK system capable of executing standard Linux programs.
    • Improved ZK RAM. [HK20a] introduced an efficient ZK-specific RAM, BubbleRAM, that consumes O(log² n) communication per access. We extend BubbleRAM with multi-level caching, decreasing communication to O(log n) per access. This introduces the possibility of a cache miss, which we handle cheaply. Our experiments show that cache misses are rare; in isolation, i.e., ignoring other processor costs, BubbleCache improves communication over BubbleRAM by more than 8×. Using BubbleCache improves our processor's total communication (including costs of cache misses) by ≈25-30%.
    • Numerous low-level optimizations, resulting in a CPU that is both more expressive and ≈5.5× faster than [HK20a]'s.
    • Attention to user experience. Our engineer-facing ZK instrumentation and extensions are minimal and easy to use.

    Put together, our system is efficient and general, and can run many standard Linux programs. The resultant machine runs at up to 11KHz on a 1Gbps LAN and supports MBs of RAM.
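    The asymptotic claim above can be sanity-checked with a toy cost model; in this Python sketch the constant factors and the 1% miss rate are invented for illustration, and only the O(log² n) versus O(log n) shapes come from the abstract.

        import math

        def bubbleram_cost(n):
            # BubbleRAM: O(log^2 n) communication per access (constants dropped).
            return math.log2(n) ** 2

        def bubblecache_cost(n, miss_rate, miss_penalty):
            # BubbleCache: O(log n) per access on a hit, plus an amortized
            # penalty for the (rare) misses.
            return math.log2(n) + miss_rate * miss_penalty

        n = 1 << 20                        # a RAM of ~1M words (illustrative)
        ram = bubbleram_cost(n)            # 400 "units" per access
        cache = bubblecache_cost(n, miss_rate=0.01, miss_penalty=ram)
        print(f"BubbleRAM {ram:.0f} vs BubbleCache {cache:.0f} per access")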

    EZEE: Epoch Parallel Zero Knowledge for ANSI C

    Recent work has produced interactive Zero Knowledge (ZK) proof systems that can express proofs as arbitrary C programs (Heath et al., 2021, henceforth referred to as ZEE); these programs can be executed by a simulated ZK processor that runs in the 10KHz range. In this work, we demonstrate that such proof systems are amenable to high degrees of parallelism. Our epoch-parallelism-based approach allows the prover and verifier to divide the ZK proof into pieces such that each piece can be executed on a different machine. These proof snippets can then be glued together, and the glued parallel proofs are equivalent to the original sequential proof. We implemented and experimentally evaluated an epoch-parallel version of the ZEE proof system. By running the prover and verifier each across 31 2-core machines, we achieve a ZK processor that runs at up to 394KHz. This allowed us to run a benchmark involving the Linux program bzip2, which would have required at least 11 days with the former ZEE system, in only 8.5 hours.
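    The epoch-parallel scheme lends itself to a small simulation; in the Python sketch below, the "ZK processor step" is a toy accumulator and the predicted epoch entry states are given up front, both stand-ins for the actual ZEE machinery.

        from concurrent.futures import ThreadPoolExecutor

        def run_epoch(args):
            # Execute one epoch's instructions from its (predicted) start state.
            state, instructions = args
            for ins in instructions:
                state += ins               # stand-in for one ZK processor step
            return state

        def prove_in_parallel(program, predicted_starts, workers=3):
            # Prove every epoch concurrently; the pieces "glue" iff each epoch
            # ends exactly where the next epoch was predicted to begin.
            with ThreadPoolExecutor(max_workers=workers) as pool:
                ends = list(pool.map(run_epoch, zip(predicted_starts, program)))
            for end, next_start in zip(ends, predicted_starts[1:]):
                assert end == next_start, "glue check failed: proofs do not compose"
            return ends[-1]

        program = [[1, 2], [3], [4, 5]]    # three epochs of toy "instructions"
        starts = [0, 3, 6]                 # predicted entry state of each epoch
        print("final state:", prove_in_parallel(program, starts))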

    Parallelizing Data Race Detection

    Detecting data races in multithreaded programs is a crucial part of debugging such programs, but traditional data race detectors are too slow to use routinely. This paper shows how to speed up race detection by spreading the work across multiple cores. Our strategy relies on uniparallelism, which executes time intervals of a program (called epochs) in parallel to provide scalability, but executes all threads from a single epoch on a single core to eliminate locking overhead. We use several techniques to make parallelization effective: dividing race detection into three phases, predicting a subset of the analysis state, eliminating sequential work via transitive reduction, and reducing the work needed to maintain multiple versions of analysis state via factorization. We demonstrate our strategy by parallelizing a happens-before detector and a lockset-based detector, and find that uniparallelism can significantly speed up data race detection. With 4× the number of cores as the original application, our strategy speeds up the median execution time by 4.4× for the happens-before detector and 3.3× for the lockset detector. Even on the same number of cores as the conventional detectors, uniparallelism's ability to elide analysis locks reduces the median overhead by 13% for the happens-before detector and 8% for the lockset detector.
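    For reference, the sequential kernel being parallelized here is a happens-before comparison over vector clocks; the Python sketch below shows only that core check, with invented clocks, not the paper's phased, epoch-parallel detector.

        def happens_before(a, b):
            # a happens before b iff a's vector clock is componentwise <= b's
            # and the two clocks differ.
            return all(x <= y for x, y in zip(a, b)) and a != b

        def is_race(access1, access2):
            (clock1, is_write1), (clock2, is_write2) = access1, access2
            # Two accesses race if neither is ordered before the other and at
            # least one of them is a write.
            unordered = (not happens_before(clock1, clock2)
                         and not happens_before(clock2, clock1))
            return unordered and (is_write1 or is_write2)

        # A write at clock (1, 0) and a read at (0, 1) are unordered: a race.
        print(is_race(((1, 0), True), ((0, 1), False)))  # True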

    Sparse record and replay with controlled scheduling

    Modern applications include many sources of nondeterminism, e.g., due to concurrency, signals, and system calls that interact with the external environment. Finding and reproducing bugs in the presence of this nondeterminism has been the subject of much prior work in three main areas: (1) controlled concurrency testing, where a custom scheduler replaces the OS scheduler to find subtle bugs; (2) record and replay, where sources of nondeterminism are captured and logged so that a failing execution can be replayed for debugging purposes; and (3) dynamic analysis for the detection of data races. We present a dynamic analysis tool for C++ applications, tsan11rec, which brings these strands of work together by integrating controlled concurrency testing and record and replay into the tsan11 framework for C++11 data race detection. Our novel twist on record and replay is a sparse approach, where the sources of nondeterminism to record can be configured per application. We show that our approach is effective at finding subtle concurrency bugs in small applications; is competitive in terms of performance with the state-of-the-art record and replay tool rr on larger applications; succeeds (due to our sparse approach) in replaying the I/O-intensive Zandronum and QuakeSpasm video games, which are out of scope for rr; but (due to limitations of our sparse approach) cannot faithfully replay applications where memory layout nondeterminism significantly affects application behaviour.
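    The sparse idea can be illustrated with a toy recorder in Python; the source names, interception hook, and log format below are hypothetical, not the tsan11rec interface.

        import random, time

        SOURCES = {                          # invented nondeterminism sources
            "rand": lambda: random.randrange(100),
            "clock": lambda: time.time_ns(),
        }

        def record(program, recorded):
            # Log only the configured (sparse) sources; others run unlogged.
            log = []
            def intercept(source):
                value = SOURCES[source]()
                if source in recorded:
                    log.append((source, value))
                return value
            program(intercept)
            return log

        def replay(program, recorded, log):
            # Inject logged values for recorded sources; re-execute the rest
            # natively, which is what makes replay sparse (and best-effort).
            cursor = iter(log)
            def intercept(source):
                if source in recorded:
                    name, value = next(cursor)
                    assert name == source
                    return value
                return SOURCES[source]()
            program(intercept)

        def demo(intercept):
            print("rand =", intercept("rand"), "clock =", intercept("clock"))

        log = record(demo, {"rand"})         # record pass: log only the RNG
        replay(demo, {"rand"}, log)          # replay: same rand, fresh clock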